Decision Trees

Question 1

Screenshot taken from Coursera

Answer

  • We pick feature x3 because splitting on it gives the lowest classification error.
    • The only mistake is at row 3 (x3 = 1, y = -1), so x3 makes just 1 error, fewer than the other features.
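The comparison above can be checked by hand with a few lines of plain Python. This is only a sketch: it assumes the Question 1 table is the same four rows used in the Question 2 cell, stores every value as an integer, and uses a hypothetical helper `split_errors` that is not part of GraphLab. Each branch of a one-level split predicts its majority label, and the helper counts the rows that disagree.

```python
# Toy data, assumed to match the four rows used in the Question 2 cell.
data = {
    'x1': [1, 0, 1, 0],
    'x2': [1, 1, 0, 0],
    'x3': [1, 0, 1, 1],
}
y = [1, -1, -1, 1]


def split_errors(feature_values, labels):
    """Count mistakes of a one-level split that predicts the majority label per branch."""
    mistakes = 0
    for value in set(feature_values):
        # Labels of the rows that fall into this branch
        branch = [label for f, label in zip(feature_values, labels) if f == value]
        majority = max(set(branch), key=branch.count)
        mistakes += sum(1 for label in branch if label != majority)
    return mistakes


errors = {name: split_errors(values, y) for name, values in data.items()}
best_feature = min(errors, key=errors.get)

print(errors)        # mistakes per feature
print(best_feature)  # feature with the fewest mistakes
```

Dividing each count by the number of rows (4) gives the classification error of the corresponding split.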

Question 2

Screenshot taken from Coursera

Answer


In [2]:
import graphlab
graphlab.canvas.set_target('ipynb')

# Toy dataset: three binary features and a binary target
x = graphlab.SFrame({'x1': [1, 0, 1, 0],
                     'x2': ['1', '1', '0', '0'],
                     'x3': ['1', '0', '1', '1'],
                     'y': ['1', '-1', '-1', '1']})
x


Out[2]:
x1  x2  x3  y
1   1   1   1
0   1   0   -1
1   0   1   -1
0   0   1   1
[4 rows x 4 columns]

In [3]:
features = ['x1', 'x2', 'x3']
target = 'y'

# Train on all four rows; no validation split for such a small dataset
decision_tree_model = graphlab.decision_tree_classifier.create(x, validation_set=None,
                                target=target, features=features)


WARNING: The number of feature dimensions in this problem is very large in comparison with the number of examples. Unless an appropriate regularization value is set, this model may not provide accurate predictions for a validation/test set.
Decision tree classifier:
--------------------------------------------------------
Number of examples          : 4
Number of classes           : 2
Number of feature columns   : 3
Number of unpacked features : 3
+-----------+--------------+-------------------+-------------------+
| Iteration | Elapsed Time | Training-accuracy | Training-log_loss |
+-----------+--------------+-------------------+-------------------+
| 1         | 0.001000     | 1.000000          | 0.634946          |
+-----------+--------------+-------------------+-------------------+
  • The best feature to split on first is x3.
  • In the tree below, the root node splits on x3 = 1, and the tree has depth 3.
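The depth of 3 can also be reproduced without GraphLab. The sketch below is a hand-rolled greedy splitter, not GraphLab's training algorithm: at each node it picks the feature with the fewest majority-vote mistakes and stops when a node is pure. The helpers `mistakes` and `tree_depth` are hypothetical names introduced for illustration, and all values are stored as integers.

```python
# Same four rows as the SFrame above, stored as integers.
rows = [
    {'x1': 1, 'x2': 1, 'x3': 1, 'y': 1},
    {'x1': 0, 'x2': 1, 'x3': 0, 'y': -1},
    {'x1': 1, 'x2': 0, 'x3': 1, 'y': -1},
    {'x1': 0, 'x2': 0, 'x3': 1, 'y': 1},
]


def mistakes(rows, feature):
    """Mistakes made when each branch of `feature` predicts its majority label."""
    total = 0
    for value in (0, 1):
        labels = [r['y'] for r in rows if r[feature] == value]
        if labels:
            majority = max(set(labels), key=labels.count)
            total += sum(1 for label in labels if label != majority)
    return total


def tree_depth(rows, features):
    """Number of split levels a greedy tree needs before every leaf is pure."""
    if len(set(r['y'] for r in rows)) <= 1 or not features:
        return 0  # pure node, or nothing left to split on
    best = min(features, key=lambda f: mistakes(rows, f))
    remaining = [f for f in features if f != best]
    child_depths = []
    for value in (0, 1):
        subset = [r for r in rows if r[best] == value]
        if subset:
            child_depths.append(tree_depth(subset, remaining))
    return 1 + max(child_depths)


depth = tree_depth(rows, ['x1', 'x2', 'x3'])
print(depth)
```

The x3 = 0 branch is already pure after the first split, while the x3 = 1 branch still mixes both labels and needs x1 and x2 to separate them, which is where the extra two levels come from.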

In [4]:
decision_tree_model.show()



In [11]:
decision_tree_model.show(view="Tree")


Question 3

Screenshot taken from Coursera


In [14]:
# Accuracy on the training data
print(decision_tree_model.evaluate(x)['accuracy'])


1.0

Question 4

Screenshot taken from Coursera

Question 5

Screenshot taken from Coursera

Question 6

Screenshot taken from Coursera

Question 7

Screenshot taken from Coursera

Question 8

Screenshot taken from Coursera

Question 9

Screenshot taken from Coursera

Question 10

Screenshot taken from Coursera

Question 11

Screenshot taken from Coursera